ABSTRACT

The objectives of this thesis were to design a method for
evaluation of the diagnostic potential of available indicators
of coronary heart disease (CHD) and to present a systematic,
quantitative procedure for aiding in its diagnosis. A sample
space of patients was divided into two mutually exclusive
groups, those with angiographic evidence of CHD, and those
with no CHD. Active duty or retired military men between the
ages of 30 and 67 years constituted the sample space. Tests
and risk factors were available in the medical literature that
a doctor could view as an indicator or contraindicator of CHD.
A vector of these possible indicators was established and the
diseased group was compared to the non-diseased group in an
effort to evaluate the diagnostic potential of the indicators.
This was done by discriminant analysis in conjunction with a
Bayesian method of weighting the importance of test results.
The important indicators were then used to formulate a model
for diagnosing CHD based on a Bayes' decision technique.
I.   INTRODUCTION

Heart  attacks  resulting  from  coronary  heart  disease 
(CHD)  cause  more  deaths  each  year  than  cancer,  strokes,  and 
accidents  combined.   These  deaths  also  include  a  broader 
spectrum  of  the  population  than  in  previous  years.   In  the 
last  century,  heart  disease  was  viewed  as  a  natural  result 
of  growing  old.   But  with  the  transition  from  a  rural  to 
an  urban  society,  and  the  inherent  traits  of  tension,  rich 
diet,  and  lack  of  exercise,  the  propensity  for  heart  disease 
has  increased.   This  increase  can  be  seen  in  the  steady 
rise  in  the  number  of  heart  attacks  among  men  over  the  past 
20  years.   The  American  Heart  Association  reported  that  of 
the  675,000  deaths  from  CHD  expected  during  the  past  year, 
176,000  would  have  been  men  and  women  under  the  age  of  65 
[Ref.  16]. 

Medical  capabilities  have  greatly  increased,  giving 
coronary  heart  disease  patients  a  greater  probability  of 
survival  once  they  are  under  medical  care,  but  since  over 
half  of  those  who  die  never  reach  a  hospital,  the  problem 
of  predicting  coronary  heart  disease  becomes  very  important. 
This  diagnostic  problem  gains  additional  importance  because 
of  the  lack  of  a  proven  method  for  the  treatment  of  CHD  in 
its  advanced  stages.   Furthermore,  there  is  an  increased 
presence  of  asymptomatic  CHD  that  may  go  undetected  with 
present  diagnostic  criteria. 


In  this  study  an  attempt  has  been  made  to  consolidate 
a  spectrum  of  risk  factors  that  can  be  incorporated  into 
diagnostic  procedures  for  CHD.   Specifically,  the  objectives 
were  to  design  a  method  for  evaluation  of  the  diagnostic 
potential  of  available  indicators  of  CHD  and  to  present  a 
systematic, quantitative procedure for aiding in its diagnosis.

A  sample  space  of  patients  was  divided  into  two  mutually 
exclusive  groups,  those  with  angiographic  evidence  of  CHD, 
and  those  with  no  CHD.   Active  duty  or  retired  military  men 
between  the  ages  of  30  and  67  years  constituted  the  sample 
space.   There  were  certain  tests  and  risk  factors  available 
in  the  medical  literature  that  a  doctor  could  view  as  an 
indicator  or  contraindicator  of  the  disease.   Having 
established  a  vector  of  these  possible  indicators,  the 
diseased  group  was  compared  to  the  nondiseased  group  in  an 
effort  to  evaluate  the  diagnostic  potential  of  the  indicators. 
This  was  done  by  discriminant  analysis  in  conjunction  with 
a  Bayesian  method  of  weighting  the  importance  of  test 
results.   The  important  indicators  were  then  used  to 
formulate a model for diagnosing CHD based on a Bayes'
decision  technique. 


II.   BACKGROUND 

Probabilistic  and  computer  aided  designs  to  aid  decision 
makers  in  medical  diagnosis  have  been  a  promising  area  of 
research  for  some  time,  and  an  abundant  literature  on  these 
subjects exists [Refs. 8, 10]. They have had little impact
on the practice of medicine, however, with several
characteristic reasons being given. Among them may be mentioned
insufficient data bases because of the poor quality, lack
of uniformity, or inaccessibility of medical records. In
addition,  there  appears  to  be  a  lack  of  understanding  and 
interface  between  the  medical  profession  and  those  who 
would  apply  probabilistic  procedures  to  aid  the  medical 
decision  makers. 

Recent  years  have  shown  an  increase  in  research  efforts 
aimed  at  the  prevention  and  diagnosis  of  CHD.   At  the 
present  time,  however,  coronary  arteriography  appears  to 
be  the  only  completely  definitive  test  for  the  disease 
[Refs.  4,  12].   Unfortunately,  this  is  a  costly  surgical 
procedure  that  requires  hospitalization  and  involves 
definite  mortality  and  morbidity  factors,  depending  on  the 
age  and  health  of  the  patient.   Arteriography  is  currently 
only  available  at  large  medical  centers  because  of  the 
equipment  and  expertise  required. 

Some diagnostic models for CHD tend to consider only
symptomatic patients, usually those with typical angina.
This omits many subjects who are asymptomatic, some of
whom may be suffering from silent heart disease.

The medical literature cites commonly accepted indicators
for CHD. Widely used indicators cited are history of
ischemic episodes, age, total cholesterol, triglycerides,
resting EKG, smoking, and family history [Refs. 4, 12, 16].
Less commonly used indicators that are also cited are race,
blood type, and blood pressure [Refs. 4, 9]. In addition,
the exercise test has recently gained widespread acceptance
as a good CHD indicator [Refs. 1, 6]. The relative importance
of this test in conjunction with other indicators
has not yet been thoroughly investigated.

It  seems  appropriate  that  a  diagnostic  model  for 
predicting  CHD  should  investigate  the  potential  of  an 
exhaustive  list  of  indicators  and  tests  for  the  disease. 
This  diagnostic  model  should  also  reduce  the  subjectivity 
in  the  decision  making  of  the  doctor  by  increasing  the 
amount of objective evidence through the appropriate
indicators and tests.


III.   DESCRIPTIVE  MODEL 

The  flow  of  patients  to  a  cardiac  clinic  is  similar 
to  the  input  of  any  other  specialty  clinic.   A  patient  may 
be  referred  to  the  cardiologist  by  another  doctor  based 
on  the  results  of  a  physical  examination  or,  if  a  person 
believes  that  he  is  suffering  from  a  cardiac  or  cardiac- 
related  illness,  he  may  voluntarily  seek  the  advice  of  the 
specialist  directly.   In  either  case,  by  the  time  a  patient 
is  admitted  to  the  cardiologist's  office,  there  is  already 
certain  data  on  him  that  is  available  to  the  physician 
without  specified  testing.   From  that  point  on,  however, 
the  diagnosis  of  a  possible  heart  disease  is  a  function  of 
the  doctor's  ability  to  assign  relative  importance  to  the 
appropriate  indicators.   Costs  of  associated  testing,  the 
procedures  available,  the  patient,  and  the  patient's  health 
may  also  have  a  bearing  on  the  doctor's  ability  to  diagnose 
correctly. 

The cardiologist then may be viewed as a decision maker
who, for each patient, receives an amount of initial
information $I_o$ from which he initiates a sequence of decisions,
gaining additional information $I_o'$ as a result of testing.
Figure 1 shows a schematic of these decision processes.
FIGURE 1
DECISION PROCESSES OF CARDIOLOGIST
[Diagram not reproduced. Recoverable labels: Diagnosis, Outcomes.]


As an illustration of the concepts implied in Figure 1,
consider that a patient is referred to the cardiologist
because he has symptoms of CHD. At decision node $D_o$ the
doctor evaluates the information he has available. Usually
this is information readily available in the patient's
medical record. Based on this information, the doctor has
two choices at $D_o$: diagnosis of the patient or requesting
additional testing. If, for example, the doctor chooses to
perform a test, decision node $D_1$ represents the choice the
doctor must make from the clinical tests available. Having
made the choice, $I_o'$ represents the information that results
from the outcome of the test. The doctor is again faced
with the decision to be made at $D_o$, but he now has the new
information $I_o'$, which reduces the chance of an incorrect
diagnosis.
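
The loop between the two decision nodes can be sketched in code. The following is only an illustration under stated assumptions: the stopping thresholds `lower` and `upper`, the function name `diagnostic_sequence`, and all probabilities in the example are hypothetical, and the updating rule is a plain Bayes update rather than anything prescribed by Figure 1. In practice the choice at each node rests with the doctor.

```python
def diagnostic_sequence(prior, tests, lower=0.05, upper=0.95):
    """Alternate between 'diagnose' and 'order a test' until confident.

    prior  -- probability of CHD based on the initial information
    tests  -- list of (name, P(result | CHD), P(result | no CHD)) tuples,
              in the order the doctor would request them
    lower, upper -- hypothetical thresholds at which testing stops
    """
    belief = prior
    used = []
    for name, p_given_chd, p_given_no_chd in tests:
        # Diagnosis node: stop as soon as the belief is decisive.
        if belief <= lower or belief >= upper:
            break
        # Test node: fold the new test result into the belief.
        numerator = belief * p_given_chd
        belief = numerator / (numerator + (1.0 - belief) * p_given_no_chd)
        used.append(name)
    return belief, used


# Hypothetical test results and probabilities, for illustration only.
tests = [
    ("exercise EKG", 0.8, 0.1),
    ("resting EKG", 0.6, 0.2),
    ("angiogram", 0.99, 0.01),
]
belief, used = diagnostic_sequence(0.1, tests)
print(round(belief, 3), used)
```

With a lower confidence requirement (for example `upper=0.7`) the same patient would be diagnosed after two tests, which is the sense in which each new result reduces the need for further, possibly costly, testing.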

A  summary  is  presented  in  Table  1  that  shows  the  possible 
path  of  a  patient  through  a  diagnostic  sequence. 


TABLE 1
PATIENT ADMITTED TO THE CARDIOLOGIST

(A)  AVAILABLE INFORMATION ($I_o$)
     Race, Sex, Age, Height, Weight, Blood Pressure, Blood Type,
     Family History of Heart Disease, Smoking History,
     History of Ischemic Episodes

FURTHER TESTING SPECIFIED BY CARDIOLOGIST

(B)  CLINICAL TESTS ($I_o'$)
     Resting EKG, Exercise EKG, Triglycerides, Cholesterol,
     Angiogram



This  summary  does  not  dictate  a  specified  sequence  of  tests 
or  weightings  of  relative  importance.   The  information  in 
(A) is data available (facts about the patient) that are
easily obtained without testing. Tests in (B) require
expert judgment or clinical procedures and, again, are not
ordered in any sequence of importance. In practice, not
all of the listed indicators are used for decision making.
Some may be considered by a particular doctor to be
unimportant. It is also difficult to assign subjective
probabilities to some of the indicators about which little is
known. Furthermore, it is impractical to correlate the
contributions of a large number of indicators without some
type of objective model.


IV.   QUANTITATIVE  METHODS 

In  general,  there  are  two  approaches  to  medical  decision 
problems.   The  first  is  to  develop  and  perfect  a  model  that 
predicts  as  well  as  or  better  than  a  physician.   The  second 
approach  consists  of  improving  ways  to  aggregate,  weight, 
and  use  information  available  to  the  physician  so  that  his 
personal  diagnosis  will  be  conducted  from  a  substantially 
sounder base. This latter approach, which is commonly
called "bootstrapping" [Ref. 10], was the one selected for
this study.

A  set  of  CHD  indicators  was  identified  and  evaluated 
experimentally  using  discriminant  analysis.   A  proposed 
method  of  assigning  weighting  factors  based  on  the  "posterior 
odds"  of  the  various  indicator  levels  was  incorporated  into 
the  analysis.   These  results  were  then  integrated  into  a 
Bayesian  diagnostic  model. 

A.   INDICATORS  AND  WEIGHTING  FACTORS 

At decision node $D_1$ of Figure 1, the doctor must decide
what test to use next in his evaluation of the patient. To
do this he must have a knowledge of what indicators of CHD
have been evaluated and the amount of additional information,
$I_o'$, he can expect to obtain from these indicators.
Complicating the doctor's evaluation is the division of the
indicators into two types, qualitative and quantitative.
The quantitative indicators are tests in which the outcome
is represented on an acceptable numerical scale. Of the
indicators used in this paper, only triglycerides,
cholesterol, age, and blood pressure were quantitative variables.
The other indicators shown in Table 1 (except height and
weight, which were not used) have results which have no
numerical scale and must be interpreted qualitatively.
For example, the indicator called history of ischemic
episodes requires the patient to verbalize his history of
chest pain. Also included in the category of qualitative
indicators are tests in which the result is numerical but
lacks meaning unless expressed in qualitative terms. The
exercise EKG result, for example, is in millimeters of
depression (or elevation) of the S-T segment, but is
interpreted in terms of being positive or negative.

As pointed out previously, these indicators and their
relative merit were determined from clinical judgment and
varied among cardiologists. In addition, the relative
importance of various outcomes of any specific test also varied
among doctors. To alleviate these problems, a two-step
procedure was used. First, the outcomes of the qualitative
tests were assigned weighting factors using Bayes' Theorem.
Second, the qualitative variables and quantitative variables
were integrated into a relative ranking using a stepwise
discriminant analysis computer routine [Ref. 11].

Consider a particular qualitative variable $i$ for which
$P(t_{ij} \mid D)$ is the conditional probability of outcome $j$
given a patient has CHD. Writing $\bar{D}$ for the absence of
CHD, the posterior probability of CHD (i.e., in light of this
information) is

$$P(D \mid t_{ij}) = \frac{P(t_{ij} \mid D)\,P(D)}{P(t_{ij} \mid D)\,P(D) + P(t_{ij} \mid \bar{D})\,P(\bar{D})} \tag{1}$$

where $P(D)$ is the presumably known prior probability of CHD.
Each of these probabilities on the right hand side of
equation (1) can be estimated from past data. The results
are a vector of values for the outcomes of a specific test
which could then be used with the outcomes of other tests
in a stepwise discriminant analysis computer routine.
However, in order to give more meaning to the weighting
factors, $w_{ij}$, they were normalized using

$$w_{ij} = \frac{P(D \mid t_{ij})}{\min_{k}\,\{P(D \mid t_{ik})\}} \tag{2}$$


where  it  was  arbitrarily  decided  to  use  the  minimum  outcome 
in  order  to  show  increasing  likelihood  of  disease  as  the 
value  of  the  weighting  factor  increased. 

Consider the following simple example to illustrate the
procedure for computing weighting factors. Suppose it is
desirable to find weighting factors for the qualitative
variable "race" ($i = R$) which, for the purpose of illustration,
has two outcomes: NEGRO ($j = 1$) and CAUCASIAN ($j = 2$).
Suppose further that the prior distribution of CHD is
$P(D) = 0.1$ and data reveal that $P(t_{R1} \mid D) = 0.2$ for
patients with CHD and $P(t_{R1} \mid \bar{D}) = 0.4$ for patients
without CHD. It then follows from equations (1) and (2)
that the weighting factors are $w_{R1} = 1.0$ and $w_{R2} = 2.46$.
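
The race example can be checked numerically. The sketch below assumes that the second probability quoted in the example (0.4) is conditional on no CHD, and that the two outcomes are exhaustive so that the probabilities for outcome 2 are the complements of those for outcome 1. The function names are illustrative only.

```python
def posterior(p_outcome_given_d, p_outcome_given_not_d, prior):
    """Equation (1): posterior probability of CHD given one test outcome."""
    numerator = p_outcome_given_d * prior
    return numerator / (numerator + p_outcome_given_not_d * (1.0 - prior))


def weighting_factors(cond_d, cond_not_d, prior):
    """Equation (2): each outcome's posterior divided by the smallest posterior."""
    posteriors = [posterior(a, b, prior) for a, b in zip(cond_d, cond_not_d)]
    smallest = min(posteriors)
    return [p / smallest for p in posteriors]


# Outcomes j = 1, 2 of the "race" variable.
# Assumption: outcome 2 probabilities are the complements of outcome 1.
cond_d = [0.2, 0.8]        # P(outcome j | CHD)
cond_not_d = [0.4, 0.6]    # P(outcome j | no CHD)
weights = weighting_factors(cond_d, cond_not_d, prior=0.1)
print([round(w, 2) for w in weights])
```

Under these assumptions the first weight is exactly 1 and the second comes out near 2.45, in line with the quoted 2.46; the small difference is presumably due to rounding of intermediate values.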

This method of computing the weighting factors
$\{w_{ij} : i = 1, \ldots, n;\ j = 1, \ldots, m\}$ provides a
consistent means of assigning scores to each of the qualitative
variables. This was done for a particular set of indicators
examined in this study and the results are given in Section VI.
Stepwise linear discriminant analysis [Ref. 11] could, at this
point, be used to develop a linear prediction function

$$L = \sum_{i=1}^{m} \lambda_i X_i$$

where $X$ is the set of all test variables (quantitative and
qualitative), $\lambda$ is the set of all coefficients assigned by
the computer routine, and $m$ is the number of tests.
Mahalanobis distance could then be used as the discrimination
criterion.

Cohn [Ref. 4] used this type of linear discriminant
analysis in its predictive role in a medical decision
context. Use of discriminant analysis for prediction was
discarded in this paper for two reasons. The technique is
a valid one when the underlying distributions of the random
variables of the two samples (in this case, the test results)
are distributed normally with equal covariance matrices (a
linearity assumption). A preliminary investigation indicated
that the variance of the test results in the two samples did
not appear to be equal. Additionally, the normality
assumption did not appear to be valid in this application.
The test results had a combination of binomial,
multinomial, and approximately normal distributions.
Consideration of all distributions as normal did not have a sound
theoretical basis.

The  actual  purpose  of  conducting  this  portion  of  the 
analysis  was  to  identify  the  relative  importance  among  the 
variables. This was accomplished by ordering the resulting
F-statistics associated with the coefficients ($\lambda$'s) of the
variables ($X$'s). The F-statistic is the ratio of the
variability of the means of the individual test results in
each sample to the pooled variance of the test results. F
will  be  large  when  there  is  a  large  difference  between  the 
mean  results  of  a  test  in  the  CHD  and  the  no  CHD  groups. 
Likewise,  the  smaller  the  F,  the  closer  together  are  the 
mean  results  for  a  particular  test  in  the  CHD  and  no  CHD 
groups.   Thus,  an  ordering  of  these  computed  F-statistics 
from  largest  to  smallest  may  be  considered  an  ordinal 
ranking  of  the  diagnostic  power  of  the  various  indicators. 
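
The ranking idea can be illustrated with a univariate F-statistic computed separately for each indicator. This is a simplified sketch: the stepwise routine of Ref. 11 computes its F values conditionally on variables already entered, which this example does not attempt to reproduce. The scores below are invented for illustration and are not thesis data.

```python
def f_statistic(group_a, group_b):
    """Two-group F: between-group mean square over pooled within-group variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    grand_mean = (sum(group_a) + sum(group_b)) / (n_a + n_b)
    # Between-group sum of squares; with two groups it has one degree of freedom.
    between = n_a * (mean_a - grand_mean) ** 2 + n_b * (mean_b - grand_mean) ** 2
    # Within-group sum of squares, pooled over both groups.
    within = (sum((x - mean_a) ** 2 for x in group_a)
              + sum((x - mean_b) ** 2 for x in group_b))
    pooled_variance = within / (n_a + n_b - 2)
    return between / pooled_variance


# Hypothetical indicator scores (not thesis data).
chd = {
    "exercise_ekg": [25, 150, 150, 25, 150],
    "cholesterol": [250, 230, 270, 240, 260],
}
no_chd = {
    "exercise_ekg": [1, 1, 25, 1, 1],
    "cholesterol": [235, 225, 265, 245, 250],
}

# Order indicators from largest to smallest F.
ranking = sorted(chd, key=lambda name: f_statistic(chd[name], no_chd[name]),
                 reverse=True)
print(ranking)
```

An indicator whose scores are identical within both groups would give a zero pooled variance, so a practical version would need to guard against division by zero.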

B.   BAYESIAN  DIAGNOSTIC  MODEL 

The foregoing procedure of Section IV.A for determining
the relative diagnostic power of the available tests or
indicators provides criteria for the cardiologist to select
appropriate tests at decision node $D_1$ in Figure 1. A
Bayesian method for quantifying the information $I_o$ and
additional information $I_o'$ is now presented.
The development of this model was based on two major
assumptions. First, it was assumed that patients being
tested either had CHD or did not have CHD. Thus, the case
of a patient having multiple diseases was excluded here.
The second assumption was that the data on both
qualitative and quantitative variables were conditionally
independent.

Let

$P(D_i)$ = a priori probability of CHD ($D_1$), or no CHD ($D_2$)

$P(D_i \mid S_1, \ldots, S_n)$ = a posteriori probability of $D_i$ given
symptoms, or indicator levels, $S_1, \ldots, S_n$

$P(S_1, \ldots, S_n \mid D_i)$ = conditional probability of symptoms
$S_1, \ldots, S_n$ given $D_i$.


The first assumption merely requires that $P(D_1) = P(\bar{D}_2)$
or $P(D_2) = P(\bar{D}_1)$. The second assumption, in terms of the
above notation, says that

$$P(S_1, \ldots, S_n \mid D_i) = \prod_{j=1}^{n} P(S_j \mid D_i) \tag{3}$$



It then follows from Bayes' Theorem that

$$P(D_i \mid S_1, \ldots, S_n) = \frac{P(D_i) \prod_{j=1}^{n} P(S_j \mid D_i)}{\sum_{k=1}^{2} P(S_1, \ldots, S_n \mid D_k)\,P(D_k)}$$

which is in terms that can be calculated using subjective
probabilities (doctor's medical opinions) and frequentistic
procedures [Ref. 8].
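
As a minimal sketch of how this posterior could be evaluated under the conditional independence assumption, the function below multiplies the prior by the per-indicator conditional probabilities for each hypothesis and normalizes. The function name and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import prod


def posterior_chd(prior_chd, likelihoods):
    """Posterior probability of CHD under conditional independence.

    likelihoods -- list of (P(S_j | CHD), P(S_j | no CHD)) pairs,
                   one pair per observed indicator level
    """
    joint_chd = prior_chd * prod(p_chd for p_chd, _ in likelihoods)
    joint_no_chd = (1.0 - prior_chd) * prod(p_no for _, p_no in likelihoods)
    return joint_chd / (joint_chd + joint_no_chd)


# Hypothetical values: prior of 0.1 and two observed indicator levels.
p = posterior_chd(0.1, [(0.6, 0.1), (0.5, 0.2)])
print(round(p, 3))
```

Here the numerator is 0.1 x 0.6 x 0.5 = 0.03 and the no-CHD term is 0.9 x 0.1 x 0.2 = 0.018, so the posterior is 0.03 / 0.048 = 0.625.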

The  majority  of  the  conditional  probabilities  were 
calculated  using  frequentistic  procedures.  Subjective 
probabilities  were  used  when  the  data  base  was  insufficient. 

In cases where a patient was missing the $j$th symptom on his
medical records or the patient was unable to take the test,
the conditional probabilities $P(S_j \mid D_1)$ and $P(S_j \mid D_2)$ were
set equal to .5 (i.e., $P(S_j \mid D_1)$ and $P(S_j \mid D_2)$ were equally
likely and thus had no influence on the associated
probabilities).
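
The neutrality of this convention can be verified directly: a pair of equal conditional probabilities multiplies both hypotheses by the same factor, which cancels when the posterior is normalized. The sketch below represents a missing indicator as `None`, which is an assumption of this example and not a convention taken from the thesis; the numbers are hypothetical.

```python
def posterior_with_missing(prior_chd, likelihoods):
    """Posterior of CHD where a missing indicator is passed as None.

    likelihoods -- list of (P(S_j | CHD), P(S_j | no CHD)) pairs or None
    """
    joint_chd = prior_chd
    joint_no_chd = 1.0 - prior_chd
    for pair in likelihoods:
        # A missing result contributes the same factor to both hypotheses.
        p_chd, p_no_chd = (0.5, 0.5) if pair is None else pair
        joint_chd *= p_chd
        joint_no_chd *= p_no_chd
    return joint_chd / (joint_chd + joint_no_chd)


# Hypothetical values: one observed indicator, with and without a missing one.
with_missing = posterior_with_missing(0.1, [(0.6, 0.1), None])
without = posterior_with_missing(0.1, [(0.6, 0.1)])
print(round(with_missing, 3), round(without, 3))
```

Both calls give the same posterior, confirming that the missing test neither raises nor lowers the probability of CHD.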

The Bayesian diagnostic model was developed because it
provided several distinct advantages over general
discriminant analysis techniques commonly used for medical decision
making. The first advantage was the use of subjective
a priori probabilities. Each doctor has his own feelings
and experience concerning the probability of CHD in a patient.
The second advantage was that the Bayesian model is
self-updating. After each patient has been diagnosed, his
characteristics can be easily added to the data, providing new
a priori probabilities. This allows the doctor to see trends
that may develop, providing the stimulus for research in
these areas. The data base is continuously enlarged in this
manner, improving the diagnostic accuracy of the model. The
third advantage is that CHD is only a small part of the
diagnostic problem facing the doctor. The Bayesian approach
allows for the expansion of the hypothesis. In the present
model only one hypothesis is treated, no CHD or CHD.
However, this could easily be expanded to no disease, CHD,
liver disease, etc. An important aspect of this is that as
the number of data points in the data vector and the number
of hypotheses are increased, the accuracy of the model
improves.
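
The self-updating property can be sketched as a running tally of diagnosed patients. The class below is hypothetical, as are its names and the example patients. It estimates probabilities as simple relative frequencies and, as a simplification, divides by all patients in a group even when some lack a given indicator.

```python
from collections import defaultdict


class IndicatorCounts:
    """Running frequency counts from which probabilities are re-estimated."""

    def __init__(self):
        self.patients = {"CHD": 0, "no CHD": 0}
        self.outcomes = defaultdict(int)

    def add_patient(self, diagnosis, findings):
        """Record one diagnosed patient and his indicator levels."""
        self.patients[diagnosis] += 1
        for indicator, level in findings.items():
            self.outcomes[(diagnosis, indicator, level)] += 1

    def conditional(self, diagnosis, indicator, level):
        """Relative frequency of an indicator level within a diagnosis group."""
        return self.outcomes[(diagnosis, indicator, level)] / self.patients[diagnosis]

    def prior(self, diagnosis):
        """Share of recorded patients with the given diagnosis."""
        return self.patients[diagnosis] / sum(self.patients.values())


# Hypothetical patients, for illustration only.
db = IndicatorCounts()
db.add_patient("CHD", {"smoking": "about 1 pack"})
db.add_patient("CHD", {"smoking": "non-smoker"})
db.add_patient("no CHD", {"smoking": "non-smoker"})
print(db.conditional("CHD", "smoking", "non-smoker"), round(db.prior("CHD"), 3))
```

Each call to `add_patient` changes the estimates returned by later calls, which is the sense in which the data base enlarges itself as patients are diagnosed.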
V.   CLINICAL TESTS AND OBSERVATIONS

Data were derived from three sources. The first source
was generated by testing a sample of individuals undergoing
routine physical examinations at Fort Ord Army Hospital.
A collection sheet was developed to record the data that
was simple yet comprehensive enough to see if trends
developed in areas not considered important in the initial
analysis (see Appendix A).

The  second  source  of  data  was  the  medical  records  at 
Letterman  General  Hospital,  San  Francisco.   A  data  sheet 
similar  to  that  of  the  Fort  Ord  sample  was  used.   However, 
several  problem  areas  were  encountered.   The  first  was  the 
problem  of  definition  and  interpretation.   Many  records 
showed  information  such  as  "positive"  family  history  with 
no  explanation  of  what  the  doctor's  opinion  was  based  on. 
Others  had  entries  such  as  "30  pack  year  history"  of 
smoking.   This  type  of  data  does  not  differentiate  between 
two  packs  per  day  for  15  years  or  three  packs  per  day  for 
10  years.   Since  intensity  of  smoking  may  be  an  important 
variable,  much  valuable  data  were  lost.   Another  problem  in 
this  area  was  the  omission  of  data  that  were  assumed  to  be 
normal.   If  a  patient's  test  result  was  abnormal,  the 
result  was  noted  in  the  patient's  record.   (However,  if 
nothing  was  noted,  it  was  not  clear  whether  the  test  result 
was  normal  or  that  the  result  was  omitted.)   It  is  clear 
that personalities become an important factor in the writing
and  in  the  reading  of  medical  records.   However,  it  is  felt 
that  as  more  records  are  automated  these  problems  will  be 
greatly  reduced. 

The  problem  of  missing  data  was  the  major  obstacle 
encountered  from  the  CHD  population.   The  majority  of  the 
patients  did  not  have  all  the  test  results  in  their  files. 
The  only  solution  to  this  problem  is  to  increase  the  sample 
size  so  that  patients  with  missing  data  can  be  removed  from 
the  sample.   But  since  one  of  the  major  objectives  of  this 
paper  was  to  develop  a  method,  the  missing  data  problem  will 
not  be  considered  within  this  framework.   For  information 
concerning  decision  making  with  missing  data,  see  Ref.  4. 

The  third  source  of  data  was  the  medical  literature. 
This was used to establish a priori probabilities of CHD
when  it  was  felt  that  the  experimental  sample  was  too  small, 
making  the  sample  probabilities  very  sensitive  to  error 
[Ref.  7]. 

The  partitioning  of  the  sample  space  into  two  parts, 
CHD  and  no  CHD,  implied  that  the  subject  in  the  healthy 
group was not suffering from any disease, and that a
subject in the CHD group was suffering from CHD only. Other
diseases  may  have  adversely  affected  the  test  results  of 
either  group.   In  the  formation  of  the  sample,  care  was 
taken  to  eliminate  all  subjects  that  had  other  diseases. 

In  the  determination  of  positive  or  negative  family 
history,  the  age  of  65  was  considered  the  cut-off.   If  a 
blood relative had CHD prior to age 65, the result was
positive.   Although  this  cut-off  was  arbitrary,  it  was  the 
one  most  consistent  with  the  available  literature.   It 
can  be  easily  changed,  however,  if  another  cut-off  is 
desired. 
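
The adjustable cut-off can be expressed as a single parameter. The helper below is a hypothetical illustration, not part of the thesis procedure; it assumes the ages at which blood relatives developed CHD are known.

```python
def family_history_positive(relative_onset_ages, cutoff_age=65):
    """True if any blood relative had CHD before the cut-off age."""
    return any(age < cutoff_age for age in relative_onset_ages)


# Hypothetical onset ages for two relatives.
print(family_history_positive([70, 58]))
print(family_history_positive([70, 58], cutoff_age=55))
```

Changing `cutoff_age` is all that is needed to re-score the family history indicator under a different convention.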

When  checking  for  chest  pain,  the  existence  of  any 
chest  pain  that  was  not  categorized  as  angina  was  listed 
as  undetermined  origin  since  none  of  the  subjects  were 
known  to  have  diseases  which  might  explain  the  pain. 

The  reading  of  the  resting  EKG  was  done  by  a  cardiologist 
whose  experience  and  subjective  opinions  must  be  considered 
an  important  part  of  the  data. 
VI.   SENSITIVITY

Sensitivity analysis was conducted in the following
areas:

1.   The effect of weighting factors on the ordinal
ranking of the qualitative indicators was investigated.
Table 2 shows how changes in weighting factors proved to
markedly influence the diagnostic ordering of the
indicators shown in Table 3.

TABLE 2

                                 Bayes'             Sample Clinical Judgment
Test                             Weighting Factor   Weighting Factor

Blood Type
  A                              1.3                2
  Other                          1                  1

Family History
  Positive                       1.3                2
  Negative                       1                  1

Smoking History (per day)
  Non-smokers                    1                  1
  Less than 1/2 pack             4.6                2
  About 1 pack                   4.5                3
  Greater than 1 pack            6.3                4

History of Ischemic Episodes
  None                           1                  1
  Chest pain                     8                  2
  Typical angina                 335                3

Resting EKG
  Normal                         1                  1
  Other                          4                  2
  ST-T abnormalities             20                 3
  Pathologic Q-waves             22.5               4

Race
  Caucasian                      6.5                1
  Negro                          1                  2
  Mongolian                      1.5                3

Exercise EKG
  Normal                         1                  1
  ST depression < 1 mm           25                 2
  ST depression ≥ 1 mm           150                3


All  other  indicators  were  quantitative.   The  following 
ordering  of  indicators  and  their  associated  F-statistics 
resulted  (Table  3) : 

TABLE 3

Bayesian Weighting Procedure

Indicator                        F-statistic
History of Ischemic Episodes     97.5709
Exercise EKG                     5.3225
Age                              3.2315
Resting EKG                      2.6744
Blood Type                       2.6532
Cholesterol                      2.2091
Density                          2.0640
Cigarette Smoking                1.8119
Systolic Blood Pressure          .9659
Family History                   .1607
Triglycerides                    .0485

Diastolic Blood Pressure and Race were omitted because of
an insignificant F value for this particular sample.

Sample Clinical Weighting Procedure

Indicator                        F-statistic
Exercise EKG                     13.0179
History of Ischemic Episodes     7.0552
Density                          2.3373
Blood Type                       2.2919
Cholesterol                      2.1914
Systolic Blood Pressure          2.1383
Resting EKG                      1.7011
Cigarette Smoking                1.5936
Family History                   1.1054
Diastolic Blood Pressure         .2963
Triglycerides                    .2573
Age                              .0243
Race                             .0169

2.   Diagnostic accuracy was investigated by varying the
prior probability of disease, $P(D)$, and assuming
$P(D \mid S_1, \ldots, S_n) > 0.5$ indicated CHD. These values for
Table 4 were determined from patients having eight or more
test results.


TABLE 4

P(D)     False Negatives**     False Positives***
.04      9/50                  0/52
.10      6/50                  0/52
.20      5/50                  0/52
.30      5/50                  1/52
.40      3/50                  1/52
.50      2/50                  1/52

**  False Negative = patient has CHD but is diagnosed as
not having CHD.

*** False Positive = patient does not have CHD but is
diagnosed as having CHD.
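
The counts in Table 4 can be turned into rates to make the effect of the prior easier to compare. The sketch below uses only the counts printed in the table; the terms sensitivity and specificity, and the function name `summarize`, are introduced here for illustration and do not appear in the original analysis.

```python
# (prior P(D), false negatives out of 50 CHD, false positives out of 52 no-CHD)
table_4 = [
    (0.04, 9, 0),
    (0.10, 6, 0),
    (0.20, 5, 0),
    (0.30, 5, 1),
    (0.40, 3, 1),
    (0.50, 2, 1),
]
N_CHD, N_NO_CHD = 50, 52


def summarize(false_neg, false_pos):
    """Return (sensitivity, specificity, overall accuracy) for one table row."""
    sensitivity = (N_CHD - false_neg) / N_CHD
    specificity = (N_NO_CHD - false_pos) / N_NO_CHD
    correct = (N_CHD - false_neg) + (N_NO_CHD - false_pos)
    accuracy = correct / (N_CHD + N_NO_CHD)
    return sensitivity, specificity, accuracy


for prior, fn, fp in table_4:
    sens, spec, acc = summarize(fn, fp)
    print(f"P(D)={prior:.2f}  sensitivity={sens:.2f}  "
          f"specificity={spec:.3f}  accuracy={acc:.3f}")
```

Raising the prior trades a few false positives for fewer false negatives; overall accuracy moves from 93 of 102 patients at a prior of .04 to 99 of 102 at a prior of .50.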

3.   After the model had been developed and the
conditional probabilities had been determined, data on CHD
patients were obtained from Walter Reed Hospital. Using the
originally determined probabilities, these patients were
tested with the Bayes' diagnostic model and 12 out of 14
were correctly diagnosed as having CHD. Again, a
$P(D \mid S_1, \ldots, S_n) > 0.5$ indicated CHD.

The  Walter  Reed  patients  were  then  added  to  the  original 
sample  to  update  the  prior  probability  of  disease.   The 
changes in the prior probabilities were so small that they
had  no  effect  on  the  diagnostic  results. 

4.   Diagnostic  accuracy  was  investigated  by  varying 
that  probability  above  which  CHD  would  be  indicated 
(Table  5) : 

TABLE 5

P(D|S_1,...,S_n)     False Negatives     False Positives
.1                   3/58                1/52
.2                   4/58                0/52
.3                   8/58                0/52
.4                   9/58                0/52
.5                   9/58                0/52
.6                   9/58                0/52
.7                   9/58                0/52
.8                   11/58               0/52
.9                   15/58               0/52
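
The mechanism behind Table 5 is a simple threshold rule applied to each patient's posterior probability. The sketch below shows how false negatives and false positives would be tallied as the cut-off moves. The posteriors and true states are invented for illustration, and the rule that CHD is indicated at or above the cut-off is an assumption of this example.

```python
def error_counts(posteriors, has_chd, cutoff):
    """Count (false negatives, false positives) for a given cut-off.

    CHD is indicated when the posterior is at or above the cut-off.
    """
    false_neg = sum(1 for p, d in zip(posteriors, has_chd) if d and p < cutoff)
    false_pos = sum(1 for p, d in zip(posteriors, has_chd) if not d and p >= cutoff)
    return false_neg, false_pos


# Hypothetical posteriors and true disease states (not thesis data).
posteriors = [0.95, 0.70, 0.35, 0.15, 0.25, 0.05, 0.02]
has_chd = [True, True, True, True, False, False, False]

for cutoff in (0.1, 0.2, 0.5, 0.9):
    print(cutoff, error_counts(posteriors, has_chd, cutoff))
```

As in Table 5, lowering the cut-off reduces false negatives at the cost of admitting false positives, so the choice of cut-off depends on which error is considered more serious.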
VII.   RESULTS AND CONCLUSIONS

As  previously  stated  in  Section  I,  the  objectives  of 
the  study  were  to  design  a  method  for  the  evaluation  of  the 
diagnostic  potential  of  available  indicators  of  CHD  and  to 
present  a  systematic,  quantitative  procedure  for  aiding  in 
its  diagnosis.   The  indicators  of  CHD  were  investigated  by 
comparing  specific  test  results  from  a  CHD  sample  and  a 
healthy  sample  with  no  CHD. 

The stepwise discriminant analysis, as presented in
Section IV.A, using all variables was performed on a CHD
sample size of 106 compared to a no CHD sample size of 56.
The weighting factors were determined by the Bayesian
approach (tabulated in Table 2, Section VI). An important
result of the discriminant analysis program was the ordering
of variables and their associated F-statistics, which may
be viewed as an ordering of the relative diagnostic
importance of the tests (see Table 3, Section VI). This method
of assigning weighting factors to test results in
conjunction with discriminant analysis is a valid procedure for
ordering the vector of tests in their diagnostic importance.
It provides a means for a doctor at decision node $D_1$ (of
Figure 1) to determine which test provides the most additional
information $I_o'$ from those available to him. Additionally,
the method is particularly valuable and easily adapted to
considering new indicators of disease where no definitive
clinical judgment exists or doctors do not agree on the
relative importance of test results.

The Bayes' diagnostic model (Section IV.B) was
developed to provide a systematic, quantitative procedure
for aiding in the diagnosis of CHD. It was evaluated by
checking how well it diagnosed patients from a known CHD
group and a known healthy group. The difficulty in
obtaining patients with all the required test results was noted
in Section V and resulted in extremely small samples with
complete data to investigate. However, six out of seven
of the CHD group were diagnosed correctly, and 33 out of 33
of the no CHD group were diagnosed correctly. When at least
eight of the test results were available, the model
diagnosed with 91% accuracy (41 out of 50 in the CHD group
were diagnosed correctly and 52 out of 52 of the no CHD
group were diagnosed correctly). These results were based
on using a posterior probability of disease of .50 as the
cut-off probability (i.e., P(D|S_1,...,S_n) >= .50 indicated
CHD). The variation of the cut-off probability (see
Section VI) demonstrated that the diagnostic accuracy
of the model was greatly influenced by the choice of the
cut-off criterion. For example, using a cut-off of .20
instead of .50 reduced the number of false negatives from
nine to four while the number of false positives remained
the same.

As a validation of the Bayes' diagnostic model, 14
known CHD patients from Walter Reed Hospital were diagnosed
by the model. Twelve of the 14 were diagnosed correctly.
The validation is not conclusive because of the extremely
small sample tested, but it does indicate that the method
is promising.

It may be desirable to use the methods presented in
a screening program to identify people with high risk of
CHD from a large population. Sufficient doctors may not
be available to examine all of the people to be tested. As
an example of the model's applicability to such a screening
program (where a doctor is not required), diagnostic accuracy
was investigated using only the results of the information
available [referred to in Figure 1 as I and in Table 1
as (A)]. The model diagnosed with 92% accuracy (19 out of
24 in the CHD group were diagnosed correctly and 44 out of
44 in the no CHD group were diagnosed correctly).
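
The accuracy figures quoted in this section follow directly from the reported counts. The short sketch below recomputes them; the counts are taken from the text, while the function name `accuracy` and the dictionary keys are labels chosen here for illustration.

```python
def accuracy(correct_chd, n_chd, correct_no_chd, n_no_chd):
    """Fraction of all patients, CHD and no CHD, diagnosed correctly."""
    return (correct_chd + correct_no_chd) / (n_chd + n_no_chd)


# (correct CHD, total CHD, correct no CHD, total no CHD), as reported in the text.
RESULTS = {
    "complete data": (6, 7, 33, 33),
    "eight or more tests": (41, 50, 52, 52),
    "screening information only": (19, 24, 44, 44),
}

if __name__ == "__main__":
    for label, counts in RESULTS.items():
        print(f"{label}: {100 * accuracy(*counts):.1f}%")
```

The last two groups work out to 93/102 and 63/68, which the text reports as 91% and 92%. In all three groups every error is a missed CHD case; none of the no CHD patients was misclassified.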

The Bayesian diagnostic model diagnosed the sample
patients with a high degree of accuracy. It is easily implemented
and appears to be well adapted to screening studies where
a large population is involved. The model continuously
updates the available patient information from which the
conditional probabilities are calculated and may be useful
in indicating trends or fluctuations in the indicators of
disease.




VIII.   AREAS  FOR  FUTURE  STUDY 

As pointed out previously (see Section IV.A), one of
the main advantages of the approach followed in the paper is
the easy expansion of the number of variables and the number
of patients to be tested. This implies that as the number
of variables is increased, the diagnosis of CHD will improve.
The expanded list of variables could also be used to
predict other diseases. Instead of a space of CHD and no
CHD, there is a space of CHD plus other diseases, limited
only by practical considerations such as time, money, and
the availability of computational equipment. The
integration of this expanded prediction model into routine physical
examinations and patient history could allow preliminary
diagnosis prior to consultations with doctors, helping to
reduce costs and the increasing patient load of doctors.

As presently modeled, diagnosis is based on results of
samples from diseased and non-diseased groups. However,
as more samples are obtained and a history of the patient's
variables (e.g., changes in blood pressure over several
years) is made, the model could be modified to diagnose on
the basis of change in a patient's variables rather than by
comparison with a norm. This would improve diagnosis among
persons suffering from one disease where the diagnosis is
being complicated by the existence of another disease.

The extension of the model to include the diagnosis of
women would require only a change in the prior probability
to include a test for sex. Additionally, a statistical
check of the indicators would be needed to determine
whether a new data base including women would be necessary.

Once  a  person  has  been  found  to  have  CHD,  a  system  to 
monitor  his  progress  under  dieting  and  exercise  control 
could  be  developed  from  the  present  model.   This  could 
allow  a  technician  rather  than  a  doctor  to  periodically 
check  the  patient's  indicators. 

The  definitions  used  for  positive  tests  throughout 
this  study  were  based  on  current  information.   Both  a 
statistical  and  medical  investigation  in  this  area  to 
better  define  test  results  could  greatly  improve  future 
models  developed  on  the  same  principles. 

A model to predict the cost of implementing and
operating the proposed diagnostic model should be explored.

APPENDIX A: SAMPLE DATA COLLECTION SHEET


Name

Date

RACE:   CAU   NEG   MON

Sex

Height
Weight

Blood Pressure

Age

Blood Type

Family History: Any of the following diagnosed heart diseases
(circle)

    Father      Uncle
    Mother      Brother      Unknown      None
    Aunt        Sister

Any of the following died of heart disease (circle)

    Father      Uncle
    Mother      Brother      Unknown      None
    Aunt        Sister

Cigarette smoking in excess of one year?   Yes   No

    If yes:   less than 1/2 pack per day
              one pack per day
              more than one pack per day

History of Ischemic episodes:

    Chest pain, undetermined origin
    Typical angina
    None

Resting EKG:

    Normal
    ST-T abnormalities
    Pathologic Q waves
    Other

Exercise EKG:

    Neg
    ST depression greater than 1 mm
    ST depression greater than 2 mm
    ST elevation

Triglycerides

Cholesterol

Max Heart Rate Attained during Exercise Test
APPENDIX B
BAYES' DIAGNOSTIC MODEL, FORTRAN FLOW CHART

[Flow chart. Boxes, in order: Read in titles, probabilities,
and patient's test results. Decision box with Yes and No
branches. Calculate new probability of disease. Check for 11
more variables as above, calculating a new probability of
disease for each. Print the patient's results.]
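
The flow chart describes a sequential Bayes update: each available test result revises the current probability of disease, and a missing result leaves it unchanged. The sketch below is a minimal illustration of that loop, not the thesis program. The update rule mirrors the expressions in the Appendix C listing, but the names `update`, `diagnose`, and `LIKELIHOODS`, the use of `None` for a missing result, and every likelihood value are hypothetical.

```python
def update(p_disease, p_result_given_chd, p_result_given_no_chd):
    """One Bayes step: posterior P(CHD) after observing a single test result."""
    numerator = p_result_given_chd * p_disease
    return numerator / (numerator + p_result_given_no_chd * (1.0 - p_disease))


def diagnose(prior, results, likelihoods, cutoff=0.5):
    """Apply update() for each available result; record the missing ones."""
    p = prior
    missing = []
    for test, outcome in results.items():
        if outcome is None:          # result not available: carry p forward
            missing.append(test)
            continue
        p_chd, p_no_chd = likelihoods[test][outcome]
        p = update(p, p_chd, p_no_chd)
    return p, missing, p >= cutoff


# Hypothetical likelihoods: (P(result | CHD), P(result | no CHD)).
LIKELIHOODS = {
    "exercise_ekg": {"abnormal": (0.6, 0.1), "normal": (0.4, 0.9)},
    "family_history": {"positive": (0.5, 0.3), "negative": (0.5, 0.7)},
}

if __name__ == "__main__":
    patient = {"exercise_ekg": "abnormal", "family_history": None}
    print(diagnose(0.04, patient, LIKELIHOODS))
```

Because each step multiplies in one likelihood ratio, the final probability does not depend on the order in which the tests are processed. That holds only under the assumption, implicit in this kind of update, that test results are conditionally independent given disease status.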
APPENDIX C: BAYES' DIAGNOSTIC MODEL FORTRAN PROGRAM LISTING

C     BAYES' DIAGNOSTIC MODEL FOR CORONARY HEART DISEASE
C
      REAL*8 B,C,D,E,F,G,S,T,U,V,BC,CD,DE,EF,FG,GS,ST,TU,UV,
     1W,XY,YZ,ZY,YX,ZF,ZW,ZM,ZB,ZH,ZO,ZG,ZN,ZC,ZD,VW,A,ZA
C
C     READ IN TITLES FOR OUTPUT
C
      READ(5,99)A,B,C,D,E,F,G,S,T,U,V,W,BC,CD,DE,EF,FG,GS,
     1ST,TU,UV,VW,XY,YZ,ZY,YX,ZF,ZW,ZM,ZA,ZB,ZH,ZO,ZG,ZN,ZC,
     1ZD
   99 FORMAT(10A8)

C     READ IN PROBABILITIES OF SYMPTOMS GIVEN NO CHD
C
C     CODE FOR INPUT OF PROBABILITIES OF SYMPTOMS GIVEN
C     NO CHD (AND CHD)
C     1ST LETTER
C        P-PROBABILITY OF
C     2ND AND 3RD LETTER
C        BD-DIASTOLIC PRESSURE
C        BS-SYSTOLIC PRESSURE
C        BT-BLOOD TYPE
C        CI-CIGARETTE HABITS
C        CH-CHOLESTEROL
C        EE-EXERCISE EKG
C        FN-FAMILY HISTORY
C        HE-HISTORY OF ISCHEMIA
C        RE-RESTING EKG
C        RN-RACE NEGRO
C        RC-RACE CAUCASIAN
C        RM-RACE MONGOLIAN
C        TY-TRIGLYCERIDE
C     4TH LETTER IN A FOUR LETTER CODE
C        D-CHD
C        N-NO CHD
C     4TH LETTER IN A FIVE LETTER CODE
C        A-ABNORMAL WHEN PRECEDED BY RE, TY, CH, BS, OR BD
C        A-ANGINA WHEN PRECEDED BY HE
C        A-BLOOD TYPE A WHEN PRECEDED BY BT
C        G-GREATER THAN 1MM DEPRESSION WHEN PRECEDED BY EE
C        G-GREATER THAN 1 WHEN PRECEDED BY CI
C        H-1/2 PACK
C        N-NEGATIVE OR NONE
C        O-BLOOD TYPE O, AB OR B WHEN PRECEDED BY BT
C        O-1 PACK WHEN PRECEDED BY CI
C        O-OTHER WHEN PRECEDED BY HE
C        P-PAIN UND. ORIGIN WHEN PRECEDED BY HE
C        P-POSITIVE
C        Q-PATH. Q WAVES
C     5TH LETTER
C        D-CHD
C        N-NO CHD

      READ(5,100)PRNN,PRCN,PRMN,PBTAN,PBTON,PFNPN,PFNNN,
     1PCIHN,PCION,PCIGN,PCINN,PHEPN,PHEAN,PHENN,PRENN,PREAN,
     1PREQN,PREON,PEENN,PEEON,PEEGN,PTYNN,PTYAN,PCHAN,PCHNN,
     1PBSAN,PBSNN,PBDAN,PBDNN,PDA
  100 FORMAT(8F10.6)
C
C     READ IN PROBABILITIES OF SYMPTOMS GIVEN CHD
C
      READ(5,101)PRND,PRCD,PRMD,PBTAD,PBTOD,PFNPD,PFNND,
     1PCIHD,PCIOD,PCIGD,PCIND,PHEPD,PHEAD,PHEND,PREND,PREAD,
     1PREQD,PREOD,PEEND,PEEOD,PEEGD,PTYND,PTYAD,PCHAD,PCHND,
     1PBSAD,PBSND,PBDAD,PBDND
  101 FORMAT(8F10.6)
      I=0.0
C     READ IN PATIENT'S TEST RESULTS
C
C     CODE FOR INPUT OF PATIENT'S TEST RESULTS
C     INDIC-NOT USED COL. 1
C     R-RACE COL. 6
C        1-NEGRO
C        2-CAUCASIAN
C        3-MONGOLIAN
C     Z-NOT USED COLS. 9-11
C     BS-SYSTOLIC PRESSURE COLS. 14-16 NUMERICAL VALUE
C     BD-DIASTOLIC PRESSURE COLS. 19-21 NUMERICAL VALUE
C     BT-BLOOD TYPE COL. 24
C        1-A
C        2-O
C        3-B
C        4-AB
C     FN-FAMILY HISTORY COL. 27
C        1-POSITIVE
C        2-NEGATIVE
C     CI-CIGARETTE HABITS COL. 30
C        1-1/2 PACK
C        2-ONE PACK
C        3-GREATER THAN ONE
C        4-NONE
C     HIE-HISTORY OF ISCHEMIA COL. 33
C        1-PAIN OF UNDETERMINED ORIGIN
C        2-ANGINA
C        3-NONE
C     RE-RESTING EKG COL. 36
C        1-NORMAL
C        2-ABNORMAL S-T SEGMENT
C        3-Q-WAVES
C        4-OTHER
C     EE-EXERCISE EKG COL. 39
C        1-NORMAL
C        2-LESS THAN 1/2 MM
C        3-1/2 TO 1 MM DEPRESSION
C        4-GREATER THAN 1MM DEPRESSION OR A ST ELEVATION
C     TRY-TRIGLYCERIDE COL. 42
C        1-NORMAL
C        2-ABNORMAL
C     CHL-CHOLESTEROL COLS. 45-47 NUMERICAL VALUE
C     AGE-AGE COLS. 50-51 NUMERICAL VALUE
C
   50 READ(5,103)INDIC,R,Z,BS,BD,BT,FN,CI,HIE,RE,EE,TRY,CHL,
     1AGE
  103 FORMAT(A1,4X,F1.0,2X,F3.0,2X,F3.0,2X,F3.0,2X,F1.0,2X,
     1F1.0,2X,F1.0,2X,F1.0,2X,F1.0,2X,F1.0,2X,F1.0,2X,F3.0,
     12X,F2.0)

C     ZERO COUNT OF MISSING TESTS
      MAGE=0
      MR=0
      MBT=0
      MFN=0
      MCI=0
      MHE=0
      MRE=0
      MEE=0
      MTY=0
      MCL=0
      MBS=0
      MBD=0
      J=0
      I=I+1
C     CHECK FOR LAST CARD
      IF(R.EQ.9)GO TO 5000
C     CHECK FOR AGE
      IF(AGE.EQ.0.0)GO TO 200
C     MAKE FIRST CHECK FOR AGE GROUP, LESS THAN 34, 34 TO
C     44, OVER 44
C     AFTER DETERMINATION, ASSIGN PRIOR PROBABILITY OF CHD
      IF(AGE.GT.34)GO TO 120
      PD=.01
      GO TO 190
  120 IF(AGE.GT.44)GO TO 130
      PD=.04
      GO TO 190
  130 PD=.07
  190 WRITE(6,191)I
  191 FORMAT('0',/,23X,'SUBJECT #',2X,I3)
      GO TO 201
  200 PD=PDA
      MAGE=MAGE+1

C     CHECK TO DETERMINE RACE, RECALCULATE PROBABILITY OF
C     CHD
  201 IF(R.EQ.0.0)GO TO 301
      IF(R.EQ.1.0)PDR=(PRCD*PD)/((PRCD*PD)+(PRCN*(1.0-PD)))
      IF(R.EQ.2.0)PDR=(PRND*PD)/((PRND*PD)+(PRNN*(1.0-PD)))
      IF(R.EQ.3.0)PDR=(PRMD*PD)/((PRMD*PD)+(PRMN*(1.0-PD)))
      GO TO 302
  301 PDR=PD
      MR=MR+1

C     CHECK TO DETERMINE BLOOD TYPE, RECALCULATE PROBABILITY
C     OF CHD
  302 IF(BT.EQ.0.0)GO TO 401
      IF(BT.EQ.1.0)PDBT=(PBTAD*PDR)/((PBTAD*PDR)+
     1(PBTAN*(1.0-PDR)))
      IF(BT.EQ.2.0)PDBT=(PBTOD*PDR)/((PBTOD*PDR)+
     1(PBTON*(1.0-PDR)))
      IF(BT.EQ.3.0)PDBT=(PBTOD*PDR)/((PBTOD*PDR)+
     1(PBTON*(1.0-PDR)))
      IF(BT.EQ.4.0)PDBT=(PBTOD*PDR)/((PBTOD*PDR)+
     1(PBTON*(1.0-PDR)))
      GO TO 402
  401 PDBT=PDR
      MBT=MBT+1

C     CHECK TO DETERMINE FAMILY HISTORY, RECALCULATE
C     PROBABILITY OF CHD
  402 IF(FN.EQ.0.0)GO TO 501
      IF(FN.EQ.1.0)PDFN=(PFNPD*PDBT)/((PFNPD*PDBT)+
     1(PFNPN*(1.0-PDBT)))
      IF(FN.EQ.2.0)PDFN=(PFNND*PDBT)/((PFNND*PDBT)+
     1(PFNNN*(1.0-PDBT)))
      GO TO 502
  501 PDFN=PDBT
      MFN=MFN+1

C     CHECK TO DETERMINE CIGARETTE HABITS, RECALCULATE
C     PROBABILITY OF CHD
  502 IF(CI.EQ.0.0)GO TO 601
      IF(CI.EQ.1.0)PDCI=(PCIHD*PDFN)/((PCIHD*PDFN)+
     1(PCIHN*(1.0-PDFN)))
      IF(CI.EQ.2.0)PDCI=(PCIOD*PDFN)/((PCIOD*PDFN)+
     1(PCION*(1.0-PDFN)))
      IF(CI.EQ.3.0)PDCI=(PCIGD*PDFN)/((PCIGD*PDFN)+
     1(PCIGN*(1.0-PDFN)))
      IF(CI.EQ.4.0)PDCI=(PCIND*PDFN)/((PCIND*PDFN)+
     1(PCINN*(1.0-PDFN)))
      GO TO 602
  601 PDCI=PDFN
      MCI=MCI+1

C     CHECK TO DETERMINE HISTORY OF ISCHEMIC EPISODES,
C     RECALCULATE PROBABILITY OF CHD
  602 IF(HIE.EQ.0.0)GO TO 701
      IF(HIE.EQ.1.0)PDHE=(PHEPD*PDCI)/((PHEPD*PDCI)+
     1(PHEPN*(1.0-PDCI)))
      IF(HIE.EQ.2.0)PDHE=(PHEAD*PDCI)/((PHEAD*PDCI)+
     1(PHEAN*(1.0-PDCI)))
      IF(HIE.EQ.3.0)PDHE=(PHEND*PDCI)/((PHEND*PDCI)+
     1(PHENN*(1.0-PDCI)))
      GO TO 702
  701 PDHE=PDCI
      MHE=MHE+1

C     CHECK TO DETERMINE RESTING EKG RESULTS, RECALCULATE
C     PROBABILITY OF CHD
  702 IF(RE.EQ.0.0)GO TO 801
      IF(RE.EQ.1.0)PDRE=(PREND*PDHE)/((PREND*PDHE)+
     1(PRENN*(1.0-PDHE)))
      IF(RE.EQ.2.0)PDRE=(PREAD*PDHE)/((PREAD*PDHE)+
     1(PREAN*(1.0-PDHE)))
      IF(RE.EQ.3.0)PDRE=(PREQD*PDHE)/((PREQD*PDHE)+
     1(PREQN*(1.0-PDHE)))
      IF(RE.EQ.4.0)PDRE=(PREOD*PDHE)/((PREOD*PDHE)+
     1(PREON*(1.0-PDHE)))
      GO TO 802
  801 PDRE=PDHE
      MRE=MRE+1

C     CHECK TO DETERMINE EXERCISE EKG RESULTS, RECALCULATE
C     PROBABILITY OF CHD
  802 IF(EE.EQ.0.0)GO TO 901
      IF(EE.EQ.1.0)PDEE=(PEEND*PDRE)/((PEEND*PDRE)+
     1(PEENN*(1.0-PDRE)))
      IF(EE.EQ.2.0)PDEE=(PEEOD*PDRE)/((PEEOD*PDRE)+
     1(PEEON*(1.0-PDRE)))
      IF(EE.GE.3.0)PDEE=(PEEGD*PDRE)/((PEEGD*PDRE)+
     1(PEEGN*(1.0-PDRE)))
      GO TO 902
  901 PDEE=PDRE
      MEE=MEE+1

C     CHECK TO DETERMINE TRIGLYCERIDES RESULTS, RECALCULATE
C     PROBABILITY OF CHD
  902 IF(TRY.EQ.0.0)GO TO 925
      IF(TRY.EQ.1.0)PDTY=(PTYND*PDEE)/((PTYND*PDEE)+
     1(PTYNN*(1.0-PDEE)))
      IF(TRY.GE.2.0)PDTY=(PTYAD*PDEE)/((PTYAD*PDEE)+
     1(PTYAN*(1.0-PDEE)))
      GO TO 926
  925 PDTY=PDEE
      MTY=MTY+1

C     CHECK TO DETERMINE CHOLESTEROL RESULTS, RECALCULATE
C     PROBABILITY OF CHD
  926 IF(CHL.EQ.0.0)GO TO 950
      IF(AGE.GT.29.0)GO TO 930
      IF(CHL.GT.240)PDCL=(PCHAD*PDTY)/((PCHAD*PDTY)+
     1(PCHAN*(1.0-PDTY)))
      IF(CHL.LE.240)PDCL=(PCHND*PDTY)/((PCHND*PDTY)+
     1(PCHNN*(1.0-PDTY)))
      GO TO 951
  930 IF(AGE.GT.39.0)GO TO 935
      IF(CHL.GT.270)PDCL=(PCHAD*PDTY)/((PCHAD*PDTY)+
     1(PCHAN*(1.0-PDTY)))
      IF(CHL.LE.270)PDCL=(PCHND*PDTY)/((PCHND*PDTY)+
     1(PCHNN*(1.0-PDTY)))
      GO TO 951
  935 IF(AGE.GT.49.0)GO TO 945
      IF(CHL.GT.310)PDCL=(PCHAD*PDTY)/((PCHAD*PDTY)+
     1(PCHAN*(1.0-PDTY)))
      IF(CHL.LE.310)PDCL=(PCHND*PDTY)/((PCHND*PDTY)+
     1(PCHNN*(1.0-PDTY)))
      GO TO 951
  945 IF(CHL.GT.330)PDCL=(PCHAD*PDTY)/((PCHAD*PDTY)+
     1(PCHAN*(1.0-PDTY)))
      IF(CHL.LE.330)PDCL=(PCHND*PDTY)/((PCHND*PDTY)+
     1(PCHNN*(1.0-PDTY)))
      GO TO 951
  950 PDCL=PDTY
      MCL=MCL+1

C     CHECK TO DETERMINE SYSTOLIC BLOOD PRESSURE,
C     RECALCULATE PROBABILITY OF CHD
  951 IF(BS.EQ.0.0)GO TO 975
      IF(BS.GE.140)PDBS=(PBSAD*PDCL)/((PBSAD*PDCL)+
     1(PBSAN*(1.0-PDCL)))
      IF(BS.LT.140)PDBS=(PBSND*PDCL)/((PBSND*PDCL)+
     1(PBSNN*(1.0-PDCL)))
      GO TO 976
  975 PDBS=PDCL
      MBS=MBS+1

C     CHECK TO DETERMINE DIASTOLIC BLOOD PRESSURE,
C     RECALCULATE PROBABILITY OF CHD
  976 IF(BD.EQ.0.0)GO TO 990
      IF(BD.GT.90)PDBD=(PBDAD*PDBS)/((PBDAD*PDBS)+
     1(PBDAN*(1.0-PDBS)))
      IF(BD.LE.90)PDBD=(PBDND*PDBS)/((PBDND*PDBS)+
     1(PBDNN*(1.0-PDBS)))
      GO TO 991
  990 PDBD=PDBS
      MBD=MBD+1

C     FORMAT OF OUTPUT INSTRUCTIONS
  991 WRITE(6,986)I
  986 FORMAT('0',3X,'THE FOLLOWING INFORMATION IS MISSING',
     1' ON SUBJECT #',2X,I3)
      IF(MAGE.EQ.0)GO TO 993
      WRITE(6,992)VW
  992 FORMAT(26X,A8)
      GO TO 994
  993 J=J+1
  994 IF(MR.EQ.0)GO TO 996
      WRITE(6,992)A
      GO TO 997
  996 J=J+1
  997 IF(MBT.EQ.0)GO TO 998
      WRITE(6,995)B,C
  995 FORMAT(26X,A8,A8)
      GO TO 999
  998 J=J+1
  999 IF(MFN.EQ.0)GO TO 1000
      WRITE(6,995)D,E
      GO TO 1001
 1000 J=J+1
 1001 IF(MCI.EQ.0)GO TO 1002
      WRITE(6,995)F,G
      GO TO 1003
 1002 J=J+1
 1003 IF(MHE.EQ.0)GO TO 1004
      WRITE(6,995)S,T
      GO TO 1005
 1004 J=J+1
 1005 IF(MRE.EQ.0)GO TO 1006
      WRITE(6,995)U,V
      GO TO 1007
 1006 J=J+1
 1007 IF(MEE.EQ.0)GO TO 1008
      WRITE(6,995)W,BC
      GO TO 1009
 1008 J=J+1
 1009 IF(MTY.EQ.0)GO TO 1010
      WRITE(6,995)CD,DE
      GO TO 1011
 1010 J=J+1
 1011 IF(MCL.EQ.0)GO TO 1012
      WRITE(6,995)EF,FG
      GO TO 1013
 1012 J=J+1
 1013 IF(MBS.EQ.0)GO TO 1014
      WRITE(6,995)GS,ST
      GO TO 1015
 1014 J=J+1
 1015 IF(MBD.EQ.0)GO TO 1016
      WRITE(6,995)TU,UV
      GO TO 1017
 1016 J=J+1
 1017 IF(J.LT.12)GO TO 1020
      WRITE(6,1018)
 1018 FORMAT('0',28X,'NONE')
 1020 WRITE(6,1021)I,PDBD
 1021 FORMAT('0',4X,'SUBJECT #',1X,I3,1X,'HAS PRCBABILITY',
     12X,F8.6,2X,'OF CORONARY',5X,'HEART DISEASE GIVEN',
     1' RESULTS IN THE FOLLOWING TESTS')
      IF(MAGE.EQ.1)GO TO 1024
      WRITE(6,1022)VW,AGE
 1022 FORMAT(18X,A8,13X,F3.0)
 1024 IF(MR.EQ.1)GO TO 1026
      IF(R.EQ.1.0)WRITE(6,1025)A,ZW
      IF(R.EQ.2.0)WRITE(6,1025)A,ZF
      IF(R.EQ.3.0)WRITE(6,1025)A,ZM
 1025 FORMAT(18X,A8,13X,A8)
 1026 IF(MBT.EQ.1)GO TO 1029
      IF(BT.EQ.1.0)WRITE(6,1028)B,C,ZA
      IF(BT.GT.1.0)WRITE(6,1028)B,C,ZB
 1028 FORMAT(18X,A8,A8,5X,A8)
 1029 IF(MFN.EQ.1)GO TO 1030
      IF(FN.EQ.1.0)WRITE(6,1028)D,E,ZY
      IF(FN.EQ.2.0)WRITE(6,1028)D,E,YX
 1030 IF(MCI.EQ.1)GO TO 1031
      IF(CI.EQ.1.0)WRITE(6,1028)F,G,ZH
      IF(CI.EQ.2.0)WRITE(6,1028)F,G,ZO
      IF(CI.EQ.3.0)WRITE(6,1028)F,G,ZG
      IF(CI.EQ.4.0)WRITE(6,1028)F,G,ZN
 1031 IF(MHE.EQ.1)GO TO 1032
      IF(HIE.EQ.1.0)WRITE(6,1028)S,T,ZC
      IF(HIE.EQ.2.0)WRITE(6,1028)S,T,ZD
      IF(HIE.EQ.3.0)WRITE(6,1028)S,T,ZN
 1032 IF(MRE.EQ.1)GO TO 1033
      IF(RE.EQ.1.0)WRITE(6,1028)U,V,XY
      IF(RE.GE.2.0)WRITE(6,1028)U,V,YZ
 1033 IF(MEE.EQ.1)GO TO 1034
      IF(EE.EQ.1.0)WRITE(6,1028)W,BC,XY
      IF(EE.GE.2.0)WRITE(6,1028)W,BC,YZ
 1034 IF(MTY.EQ.1)GO TO 1035
      IF(TRY.EQ.1.0)WRITE(6,1028)CD,DE,XY
      IF(TRY.GE.2.0)WRITE(6,1028)CD,DE,YZ
 1035 IF(MCL.EQ.1)GO TO 1037
      WRITE(6,1036)EF,FG,CHL
 1036 FORMAT(18X,A8,A8,5X,F4.0)
 1037 IF(MBS.EQ.1)GO TO 1038
      WRITE(6,1036)GS,ST,BS
 1038 IF(MBD.EQ.1)GO TO 50
      WRITE(6,1036)TU,UV,BD
      GO TO 50
 5000 STOP
      END
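
Two age-dependent rules are embedded in the listing above: the prior probability of CHD (.01, .04, or .07 by age group, with the pooled value PDA used when age is missing) and the cholesterol limit above which a reading is treated as abnormal (240, 270, 310, or 330). The sketch below restates them compactly for readability. The constants are read from the listing; the function names and the use of `None` for a missing age (the listing codes it as 0) are assumptions made here.

```python
def prior_by_age(age, pooled_prior):
    """Prior P(CHD) by age band; pooled_prior stands in for PDA when age is missing."""
    if age is None:
        return pooled_prior
    if age <= 34:
        return 0.01
    if age <= 44:
        return 0.04
    return 0.07


def cholesterol_is_abnormal(cholesterol, age):
    """True when cholesterol exceeds the limit for the patient's age band."""
    if age <= 29:
        limit = 240
    elif age <= 39:
        limit = 270
    elif age <= 49:
        limit = 310
    else:
        limit = 330
    return cholesterol > limit


if __name__ == "__main__":
    # Ages and cholesterol reading taken from the sample output in Appendix D;
    # the pooled prior 0.05 is a placeholder, not a value from the thesis.
    print(prior_by_age(44, pooled_prior=0.05))
    print(cholesterol_is_abnormal(315, 28))
```

The sketch departs from the listing in one way. The listing falls through to the 240 limit when age is missing, because a missing age is stored as 0. The sketch does not reproduce that fall-through and expects a known age for the cholesterol check.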
APPENDIX D
SAMPLE  OUTPUT 

SUBJECT  #     7 

THE  FOLLOWING  INFORMATION  IS  MISSING  ON  SUBJECT  # 

RACE 

BLOOD  TYPE 
RESTING   EKG 
TRIGLYCERIDE 
CHOLESTEROL 

SUBJECT  #    7  HAS  PRCBABILITY   0.979999   OF  CORONARY 
HEART  DISEASE  GIVEN  RESULTS  IN  THE  FOLLOWING  TESTS 

AGE  44. 

FAMILY  HISTORY  NEGATIVE 

SMOKING  HABITS  NONE 

ISCHEMIA  HISTORY  ANGINA 

EXERCISE  EKG  ABNORMAL 

SYSTOLIC  PRESS.  150. 

DIASTOLIC  PRESS.  100. 


SUBJECT  #  8 

THE  FOLLOWING  INFORMATION  IS  MISSING  ON  SUBJECT  # 

NONE 

SUBJECT  #    8  HAS  PRCBABILITY  0.999946   OF  CORONARY
HEART  DISEASE  GIVEN  RESULTS  IN  THE  FOLLOWING  TESTS 

AGE  28. 

RACE  WHITE 

BLOOD  TYPE  A 

FAMILY  HISTORY  NEGATIVE 

SMOKING  HABITS  OVER  ONE 

ISCHEMIA  HISTORY  ANGINA 

RESTING   EKG  ABNORMAL 

EXERCISE  EKG  ABNORMAL 

TRIGLYCERIDE  ABNORMAL 

CHOLESTEROL  315. 

SYSTOLIC  PRESS.  118. 

DIASTOLIC  PRESS.  76. 